A New Method for Ranking of Word Hypotheses Generated from OCR: The Application on the Arabic Word Recognition

نویسندگان

  • Houda Gaddour
  • Hanêne Guesmi
  • Fouad Slimane
  • Slim Kanoun
  • Jean Hennebert
چکیده

In this paper, we propose a new method for the best ranking of OCR word hypotheses in order to increase the chances that the correct hypothesis will be ranked in the first position. This method is based on the images construction of the OCR word hypotheses and the calculation of the dissimilarity scores between these last constructed images and the image to recognize. To evaluate the new proposed method, we compare them with a classic method which is based on the ranking of OCR word hypotheses under the recognition process. The experimental results of these two methods on the database of 1000 word images show that the new proposed method led to the best ranking of OCR word hypotheses.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

تشخیص دست‌نوشتۀ‌ برخط فارسی با استفاده از مدل زبانی و کاهش قوانین نگارش کاربر

The Joint-up, cursive form of Persian words and immense variety of its scripts, also different figures of Persian letters depending on their sitting positions in the words, have turned the Persian handwritings recognition to an intense challenge. The major obstacle of the most often recognition ways, is their inattention to sentence contexture which causes utilizing of a word with correct appea...

متن کامل

Novel Arabic OCR Degraded Text Retrieval Model

This paper provides a novel model enhances the Arabic OCR degraded text retrieval effectiveness. The model simulates the Arabic OCR recognition mistakes happens while the recognition process based on word based approach. Then using the expected OCR errors the model expands the user search query. The resulting expanded search query produced higher precision and recall in searching Arabic OCRDegr...

متن کامل

Persian/Arabic Baffletext CAPTCHA

Nowadays, many daily human activities such as education, trade, talks, etc are done by using the Internet. In such things as registration on Internet web sites, hackers write programs to make automatic false registration that waste the resources of the web sites while it may also stop it from functioning. Therefore, human users should be distinguished from computer programs. To this end, this p...

متن کامل

یک روش دو مرحلهای برای بازشناسی کلمات دستنوشته فارسی به کمک بلوکبندی تطبیقی گرادیان تصویر

This paper presented a two step method for offline handwritten Farsi word recognition. In first step, in order to improve the recognition accuracy and speed, an algorithm proposed for initial eliminating lexicon entries unlikely to match the input image. For lexicon reduction, the words of lexicon are clustered using ISOCLUS and Hierarchal clustering algorithm. Clustering is based on the featur...

متن کامل

Arabic Character Recognition using Approximate Stroke Sequence

Arabic character recognition of handwriting is addressed. A novel approach for the Arabic Character Recognition is presented based on statistical analysis of a typical Arabic text is presented. Results showed that the sub-word in Arabic language is the basic pictorial block rather than the word. The method of approximate stroke sequence is applied for the recognition of some Arabic characters i...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011